Topic-focus and Salience

Authors

  • Eva Hajičová
  • Petr Sgall
Abstract

Most current work on corpus annotation concentrates on morphemics, lexical semantics and sentence structure. However, it is becoming increasingly obvious that attention should and can also be paid to phenomena that reflect the links between a sentence and its context, i.e. the discourse anchoring of utterances. Conceived in this way, an annotated corpus can be used as a resource for linguistic research not only within the limits of the sentence, but also with regard to discourse patterns. Applications of this research to information retrieval and extraction may thus be made more effective, and applications in new domains become feasible, whether they serve linguistic (and literary) aims proper, such as text segmentation and the specification of the topics of parts of a discourse, or other disciplines. These considerations motivated the decision that the tectogrammatical (i.e. underlying, see below) tagging done within the Prague Dependency Treebank (PDT) should also contain attributes concerning certain contextual features, i.e. the contextual anchoring of word tokens and their relationships to their coreferential antecedents. Along with this enrichment in the intersentential aspect, we also pay attention to intrasentential issues, i.e. to sentence structure, which displays its own features oriented towards the contextual potential of the sentence, namely its topic-focus articulation (TFA). In the present paper, we first outline the annotation scenario of the PDT (Section 2) and then concentrate on the use of one of the PDT attributes for the specification of the Topic and the Focus (the 'information structure') of the sentence (Section 3). In Section 4, we present certain heuristics that are partly based on TFA and that allow for the specification of the degrees of salience in a discourse. The application of these heuristics is illustrated in Section 5.

Similar resources

Salience Theory and Pricing Stock of Corporates in Tehran Stock Exchange

How the investors react to the received information plays a crucial role in determining the return of stock exchange market. Supply and demand based upon incorrect decisions lead to the price deviation of inherent values. This paper aims to study the impact of salience phenomenon on disproportionate pricing and investor overreaction in the corporates in Tehran stock exchange. Research methodolo...

The Comparison of Computer Assisted Teaching and Traditional Explicit Method in Learning / Teaching English Vocabulary.

This review surveys research on second language vocabulary teaching and learning since 1999. It first considers the distinction between incidental and intentional vocabulary learning. Although learners certainly acquire word knowledge incidentally while engaged in various language learning activities, more direct and systematic study of vocabulary is also required. There is a discussion of how word...

The Impact of Topic Characteristics and Threat on Willingness to Engage with Wikipedia Articles: Insights from Laboratory Experiments

A growing body of research aims to identify the factors that motivate people to make contributions in Wikipedia. We conducted two laboratory experiments to investigate the connections between topic characteristics, perception of threat, and willingness to engage with Wikipedia articles. In Study 1 (N = 83), we examined how topic familiarity, topic controversiality, and mortality salience influe...

Extra-Sentential Elements, Prosodic Restructuring, and Information Structure. A Study of Clitic-Left Dislocation in Spontaneous French

The aim of this study is to discuss the hypothesis according to which prosodic prominence and pragmatic salience are closely related. To do so, we focus on French clitic-left dislocated subjects, namely those constructions where the subject occupies a position in the left periphery and co-refers with a resumptive clitic pronoun within the main clause. Scholars have made the hypothesis that the ...

Salience Rank: Efficient Keyphrase Extraction with Topic Modeling

Topical PageRank (TPR) uses latent topic distribution inferred by Latent Dirichlet Allocation (LDA) to perform ranking of noun phrases extracted from documents. The ranking procedure consists of running PageRank K times, where K is the number of topics used in the LDA model. In this paper, we propose a modification of TPR, called Salience Rank. Salience Rank only needs to run PageRank once and ...

A review of text mining approaches and their function in discovering and extracting a topic

Background and aim: Four text mining methods are examined and focused on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...

Publication date: 2001